Abstract

We consider the moment space $\mathcal{M}_n^K$ corresponding to $p \times p$ complex matrix measures
defined on $K$ ($K = [0, 1]$ or $K = \mathbb{T}$). We endow this set with the uniform distribution.
We are mainly interested in large deviations principles (LDP) when $n \to \infty$. First we fix
an integer $k$ and study the vector of the first $k$ components of a random element of $\mathcal{M}_n^K$.
We obtain a LDP in the set of $k$-arrays of $p \times p$ matrices. Then we lift a random element
of $\mathcal{M}_n^K$ into a random measure and prove a LDP at the level of random measures. We
end with a LDP on Carathéodory and Schur random functions. These last functions are
closely connected to the above random measure. In all these problems, we take advantage of
the so-called canonical moments technique by introducing new (matricial) random variables
that are independent and have explicit distributions.
1 Introduction



1.1 Preliminary: some notations 



Throughout this article, $p$ is a positive integer, and $p = 1$ is referred to as the scalar case.
We denote by $\mathcal{S}_p(\mathbb{C})$ the set of all Hermitian $p \times p$ matrices and by $\mathcal{S}_p^+(\mathbb{C})$ the set
of all nonnegative definite Hermitian $p \times p$ matrices. If $A, B \in \mathcal{S}_p(\mathbb{C})$ we write $A \le B$ (resp. $A < B$) if,
and only if, $B - A$ is nonnegative (resp. positive) definite. This is the so-called Loewner partial
order on $\mathcal{S}_p(\mathbb{C})$ (see for example Horn and Johnson (1985)). We recall that every $A \in \mathcal{S}_p^+(\mathbb{C})$ has
a unique nonnegative square root denoted by $A^{1/2} \in \mathcal{S}_p^+(\mathbb{C})$. The set of all $p \times p$ unitary matrices
is denoted by $\mathbb{U}(p)$.
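The Loewner order and the nonnegative square root can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `loewner_leq` and `psd_sqrt` and the tolerance are illustrative choices, not notation from this paper.

```python
import numpy as np

def loewner_leq(A, B, tol=1e-10):
    """True if A <= B in the Loewner order, i.e. B - A is Hermitian nonnegative definite."""
    D = B - A
    hermitian = np.allclose(D, D.conj().T, atol=tol)
    return bool(hermitian and np.linalg.eigvalsh(D).min() >= -tol)

def psd_sqrt(A):
    """Nonnegative definite square root of a nonnegative definite Hermitian matrix."""
    w, V = np.linalg.eigh(A)           # A = V diag(w) V^*
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.conj().T

# Hermitian example with eigenvalues 1 and 3
A = np.array([[2.0, 1.0j], [-1.0j, 2.0]])
I2 = np.eye(2)
R = psd_sqrt(A)
```

Here `loewner_leq(I2, A)` holds while `loewner_leq(A, I2)` does not, and `R @ R` recovers `A` up to round-off.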

Let $K$ be either $[0,1]$ or $\mathbb{T} := \{z \in \mathbb{C} : |z| = 1\}$. A matrix-valued probability measure on $K$ is a
measure $\mu$ on $K$ with values in $\mathcal{S}_p^+(\mathbb{C})$ such that

$\int_K d\mu = I_p$,

where $I_p$ is the $p \times p$ identity matrix. We denote by $\mathcal{P}(K)$ the set of all matrix-valued probability
measures on $K$. In general, if $(X, \mathcal{A})$ is a measurable space, we denote by $\mathcal{M}_1(X)$ the set of all
probability measures on $X$. We equip it with the weak convergence topology. This is the coarsest
topology such that the mappings $\mu \mapsto \int f(x)\,d\mu(x)$ are continuous, where $f \in C_b(X)$ (the space
of bounded continuous functions on $X$) is arbitrary (see Berg (2008) for completeness).

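One of the main objects of interest in our work is, for $n \in \mathbb{N}$, the matricial moment space $\mathcal{M}_n^K$
defined by

(1.1)  $\mathcal{M}_n^K := \left\{ \left( \int_K x^j \, d\mu(x) \right)_{1 \le j \le n} : \mu \in \mathcal{P}(K) \right\}$.

This is a compact set having a nonempty interior, denoted by $\operatorname{Int} \mathcal{M}_n^K$ (see Dette and Studden
(2002) for $K = [0, 1]$ and Dette and Wagener (2010) for $K = \mathbb{T}$).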


1.2 What is done in this paper? 

The aim of our work is to give a picture of the asymptotic behaviour of the sequence of sets $(\mathcal{M}_n^K)_n$.
More precisely, we first equip the set $\mathcal{M}_n^K$ with the uniform distribution $\mathbb{P}_{K,n}$. Then, for $k < n$, we
consider $\mathbb{P}_{K,n,k}$, the pushforward probability of $\mathbb{P}_{K,n}$ under the projection on $\mathcal{M}_k^K$. We study, for
fixed $k$, the exponential convergence of $(\mathbb{P}_{K,n,k})_n$ when $n$ goes to infinity. The asymptotic behaviour
of $(\mathbb{P}_{K,n,k})_n$ was widely studied in the scalar case, beginning with the seminal paper of Chang et al.
(1993), where a central limit theorem (CLT) for $(\mathbb{P}_{[0,1],n,k})_n$ is proved. Roughly speaking, $(\mathbb{P}_{[0,1],n,k})_n$
converges to the degenerate distribution concentrated on the first $k$ moments of the nonsymmetric
arcsine law, and there are Gaussian fluctuations around this limit. In the same frame,
large deviations are studied in Gamboa and Lozada-Chang (2004). In these papers, the main
ingredient for obtaining asymptotic results is a clever reparametrization of $\mathcal{M}_n^{[0,1]}$. The new
parameters, defined recursively, are the so-called canonical moments (see Dette and Studden (1997)
for a complete overview). Informally, given the first $k - 1$ moments, the $k$-th canonical moment
is the relative position of the $k$-th moment in the range (an interval) of possible $k$-th moments.
For fixed $n$, this defines a bijection between $\operatorname{Int} \mathcal{M}_n^{[0,1]}$ and $(0, 1)^n$. The key property is
that the pushforward of the rather involved probability measure $\mathbb{P}_{[0,1],n,k}$ under this mapping is a
product measure, i.e. the canonical moments are independent. This is an old result, first shown by
Skibinsky (1969) (a simple proof is given in the first chapter of Dette and Studden (1997)).

Moreover, extensions of the asymptotic results on $(\mathbb{P}_{K,n,k})_n$ at the process level are studied in
Dette and Gamboa. Also in the scalar case, and using a suitable cousin reparametrization
(also called canonical moments or Verblunsky coefficients), a CLT and large deviations are tackled
for $K = \mathbb{T}$ in Lozada-Chang (2005). In this last paper, a step toward a multidimensional setting,
that is replacing $[0, 1]$ by $[0, 1]^d$ ($d > 1$), is also made. In a more recent work, Dette and Nagel
(2010) extend some of the asymptotic results previously described to the matricial moment problem
on $[0,1]$ ($p > 1$). As a matter of fact, by using the right extension of canonical moments
proposed and first studied in Dette and Studden (2002), it is shown there that a CLT holds.
As before, the key property is the independence, under the uniform distribution on $\mathcal{M}_n^{[0,1]}$, of
the components of the matricial canonical moment vector. Here, we revisit these results and obtain new asymptotic
results on $\mathcal{M}_n^K$. First, we obtain a CLT when $K = \mathbb{T}$. Further, we show large deviations principles
(LDP) in both cases, $K = [0, 1]$ and $K = \mathbb{T}$. These LDPs are at level 2, which means that they hold
for sequences of distributions of random matricial measures having uniform matricial moments.
The main tool is similar to the one used in the scalar case, namely the stochastic
independence of the matricial canonical moments. Nevertheless, the matricial case is
more technical and, because of noncommutativity, needs more care. Moreover, thanks to the general
invariance Proposition 3.5, the complex case ($K = \mathbb{T}$) is tackled by using a polar decomposition
argument.

Besides, it is well known that the truncated trigonometric moment problem is connected to two problems
of functional analysis on the disc: the so-called Carathéodory and Schur problems, respectively.
Let us explain the setting in the scalar case, although our results will be in the general matrix
case. An analytic function $F$ on $\mathbb{D} := \{z \in \mathbb{C} : |z| < 1\}$ is called a Carathéodory function iff
$F(0) = 1$ and $\operatorname{Re} F(z) > 0$ for all $z \in \mathbb{D}$. Let $\mathcal{C}_1$ be the set of all these functions. An
analytic function $f$ on $\mathbb{D}$ is called a Schur function iff $\sup_{z \in \mathbb{D}} |f(z)| \le 1$. Let $\mathfrak{S}_1$ be the set of all
Schur functions. The correspondence

(1.2)  $F(z) = \dfrac{1 + z f(z)}{1 - z f(z)}, \qquad f(z) = \dfrac{1}{z}\,\dfrac{F(z) - 1}{F(z) + 1}$

is one-to-one between $\mathcal{C}_1$ and $\mathfrak{S}_1$. Any $F \in \mathcal{C}_1$ has a representation

(1.3)  $F(z) = \int_{\mathbb{T}} \dfrac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta)$

for a unique probability measure $\mu$ on $\mathbb{T}$ (Herglotz representation theorem). The Taylor expansion
of $F$ is

(1.4)  $F(z) = 1 + 2 \sum_{n=1}^{\infty} c_n(F) z^n$

where the $c_n$'s are the conjugate moments of $\mu$, i.e.

$c_n(F) = \int_{\mathbb{T}} e^{-in\theta}\, d\mu(\theta)$,

the complex conjugate of the $n$-th trigonometric moment of $\mu$.

The classical Carathéodory problem is to find $F \in \mathcal{C}_1$ such that the first $n$ Taylor coefficients
coincide with given numbers $c_1, \dots, c_n$. It is clearly equivalent to the truncated moment problem.
The Taylor expansion of $f$ is

(1.5)  $f(z) = \sum_{n=0}^{\infty} s_n(f) z^n$.

The Schur problem is to find a Schur function $f(z)$ such that the first $n$ Taylor coefficients coincide
with given numbers $s_0, \dots, s_{n-1}$. The set of admissible coefficient vectors

$\{(s_0(f), \dots, s_{n-1}(f)) : f \in \mathfrak{S}_1\}$

is a compact subset of $\mathbb{C}^n$. In the general matrix case, we will study the impact of uniform
sampling on the space of Taylor coefficients of these functions. These results are new, even in
the scalar case.
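The scalar correspondence (1.2) between Schur and Carathéodory functions can be illustrated numerically. The sketch below is only an illustration: the function names and the test function $f(z) = z/2$ are assumptions chosen for the example, not objects from this paper.

```python
def caratheodory_from_schur(f, z):
    """F(z) = (1 + z f(z)) / (1 - z f(z)), the first formula in (1.2)."""
    w = z * f(z)
    return (1 + w) / (1 - w)

def schur_from_caratheodory(F, z):
    """f(z) = (F(z) - 1) / (z (F(z) + 1)), the second formula in (1.2); needs z != 0."""
    Fz = F(z)
    return (Fz - 1) / (z * (Fz + 1))

# Hypothetical Schur function: |f(z)| <= 1/2 on the unit disc
f = lambda z: 0.5 * z
F = lambda z: caratheodory_from_schur(f, z)

z = 0.3 + 0.4j                               # a point of the open unit disc
roundtrip_error = abs(schur_from_caratheodory(F, z) - f(z))
```

For this choice, `F(0)` equals 1, `F(z)` has positive real part, and applying the second formula to `F` returns `f(z)` up to round-off.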

One of the main goals of random matrix theory is to obtain asymptotic results in the limit of
large size. Here, on the contrary, the size $p$ of the matrices is fixed but the dimension $n$ of the array
of matrices tends to infinity. At first sight, these two topics are very distinct. Nevertheless,
even in the case $p = 1$, there is a connection between the random moment problem and
random matrix theory, as described in Gamboa and Rouault (2010). Let us formulate it briefly
in the generic situation. The spectral measure of the pair consisting of an $n \times n$ matrix (unitary or
Hermitian) and a fixed vector is a discrete measure. It can be described either by its locations ($n$
points) and its weights, or by a convenient array of its moments. When the matrix is random, both
representations have remarkable distributions, and the asymptotic behaviour can be considered
from two points of view. If we now fix $p$ orthonormal vectors instead of only one, we obtain a
random matricial spectral measure and we may consider the array of its (matricial) moments.
These asymptotics will be treated in a forthcoming paper.

The paper is organized as follows. Section 2 is devoted to the case $K = [0, 1]$. It begins with
useful definitions and properties around LDPs and ends with the main result on level 2 LDP
(Theorem 2.8). Section 3 is devoted to the case $K = \mathbb{T}$. We first show a CLT (Theorem 3.6
and Corollary 3.7) and then turn to large deviation results (Corollaries 3.8 and 3.9, Theorem
3.10). In Section 4, we establish a LDP for random Carathéodory functions and random Schur
functions, respectively (Theorem 4.1). All technical proofs are postponed to a later section.



2 Matrix measures on [0, 1]

Here, we will work on $K = [0, 1]$ and the set defined in (1.1) is

(2.1)  $\mathcal{M}_n^{[0,1]} = \left\{ S_n = (S_1, \dots, S_n) \;\middle|\; S_j := \int_0^1 x^j \, d\mu(x), \ j = 1, \dots, n, \ \mu \in \mathcal{P}([0, 1]) \right\} \subset (\mathcal{S}_p^+(\mathbb{C}))^n$.

The moment space is a compact subset of $(\mathcal{S}_p(\mathbb{C}))^n$ with nonempty interior
(Dette and Studden (2002)). Therefore the uniform distribution $\mathcal{U}(\mathcal{M}_n^{[0,1]})$ is well defined by
the density

(2.2)  $\left( \int_{\mathcal{M}_n^{[0,1]}} dS_1 \cdots dS_n \right)^{-1} \mathbb{1}\{ S_n \in \mathcal{M}_n^{[0,1]} \}$

with respect to $dS_1 \cdots dS_n$ where, if $S = (s_{ij})_{i,j=1}^{p}$,

(2.3)  $dS = \prod_{1 \le i \le j \le p} ds_{ij}^{R} \prod_{1 \le i < j \le p} ds_{ij}^{I}$,

where for $s \in \mathbb{C}$, $s := s^R + i s^I$ is the standard decomposition of $s$ into real and imaginary parts.
The main tool to study random moments $S_n \sim \mathcal{U}(\mathcal{M}_n^{[0,1]})$ are the canonical moments, which are
introduced in the next section.
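2.1 Canonical moments for matrix measures on [0, 1]

For a moment vector $(S_1, \dots, S_n) \in \mathcal{M}_n^{[0,1]}$ we build the block Hankel matrices (with $S_0 := I_p$)

(2.4)  $\underline{H}_{2m} := \begin{pmatrix} S_0 & \cdots & S_m \\ \vdots & & \vdots \\ S_m & \cdots & S_{2m} \end{pmatrix}, \qquad \overline{H}_{2m} := \begin{pmatrix} S_1 - S_2 & \cdots & S_m - S_{m+1} \\ \vdots & & \vdots \\ S_m - S_{m+1} & \cdots & S_{2m-1} - S_{2m} \end{pmatrix}$

and

(2.5)  $\underline{H}_{2m+1} := \begin{pmatrix} S_1 & \cdots & S_{m+1} \\ \vdots & & \vdots \\ S_{m+1} & \cdots & S_{2m+1} \end{pmatrix}, \qquad \overline{H}_{2m+1} := \begin{pmatrix} S_0 - S_1 & \cdots & S_m - S_{m+1} \\ \vdots & & \vdots \\ S_m - S_{m+1} & \cdots & S_{2m} - S_{2m+1} \end{pmatrix}$.

Dette and Studden (2002) showed that the point $(S_1, \dots, S_n)$ is in $\operatorname{Int} \mathcal{M}_n^{[0,1]}$ if, and only if, the
matrices $\underline{H}_n$ and $\overline{H}_n$ are both positive definite.

For $(S_1, \dots, S_n) \in \operatorname{Int}(\mathcal{M}_n^{[0,1]})$ we define

$\underline{h}_{2m}^T = (S_{m+1}, \dots, S_{2m})$,
$\underline{h}_{2m-1}^T = (S_m, \dots, S_{2m-1})$,
$\overline{h}_{2m}^T = (S_m - S_{m+1}, \dots, S_{2m-1} - S_{2m})$,
$\overline{h}_{2m-1}^T = (S_m - S_{m+1}, \dots, S_{2m-2} - S_{2m-1})$,

and consider the $p \times p$ matrices

(2.6)  $S_{n+1}^- = \underline{h}_n^T \, \underline{H}_{n-1}^{-1} \, \underline{h}_n, \quad n \ge 1$,

(2.7)  $S_{n+1}^+ = S_n - \overline{h}_n^T \, \overline{H}_{n-1}^{-1} \, \overline{h}_n, \quad n \ge 2$

(for the sake of completeness we also define $S_1^- = 0_p$ and $S_1^+ = I_p$, $S_2^+ = S_1$). Note that
$S_{n+1}^-$ and $S_{n+1}^+$ are continuous functions of $(S_1, \dots, S_n)$ and that $S_n^- < S_n < S_n^+$ if and only if
$(S_1, \dots, S_n) \in \operatorname{Int} \mathcal{M}_n^{[0,1]}$. These preliminary notations allow us to introduce the canonical moments
of a matrix measure on $[0, 1]$.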

Definition 2.1 For $S_n = (S_1, \dots, S_n) \in \operatorname{Int} \mathcal{M}_n^{[0,1]}$ we define the canonical moments by

(2.8)  $U_k = (S_k^+ - S_k^-)^{-1/2} (S_k - S_k^-) (S_k^+ - S_k^-)^{-1/2}, \quad k = 1, \dots, n$.
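In the scalar case $p = 1$ the first two canonical moments can be written in closed form, which gives a quick sanity check of Definition 2.1. The sketch below assumes the standard scalar bounds $S_1^- = 0$, $S_1^+ = 1$, $S_2^- = S_1^2$ and $S_2^+ = S_1$; the function name is illustrative.

```python
from fractions import Fraction

def first_two_canonical_moments(s1, s2):
    """Scalar (p = 1) canonical moments U_1, U_2 from the ordinary moments s1, s2.

    Assumes the scalar moment ranges S_1 in [0, 1] and S_2 in [s1^2, s1]."""
    u1 = (s1 - 0) / (1 - 0)                  # relative position of s1 in [0, 1]
    lower2, upper2 = s1 * s1, s1             # range of the second moment given s1
    u2 = (s2 - lower2) / (upper2 - lower2)   # relative position of s2 in that range
    return u1, u2

# Uniform distribution on [0, 1]: moments 1/2 and 1/3
uniform = first_two_canonical_moments(Fraction(1, 2), Fraction(1, 3))
# Arcsine law on [0, 1]: moments 1/2 and 3/8
arcsine = first_two_canonical_moments(Fraction(1, 2), Fraction(3, 8))
```

The arcsine law gives $U_1 = U_2 = 1/2$, while the uniform distribution gives $U_1 = 1/2$ and $U_2 = 1/3$.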
It is clear that each $U_k \in \mathcal{S}_p(\mathbb{C})$ and satisfies $0_p < U_k < I_p$. Therefore we can define a mapping

(2.9)  $\varphi^{(n)} : \operatorname{Int} \mathcal{M}_n^{[0,1]} \to (0_p, I_p)^n, \qquad \varphi^{(n)}(S_n) = U_n = (U_1, \dots, U_n)$.

By equation (2.8), the ordinary moments can be recursively calculated from the canonical
moments and the mapping $\varphi^{(n)}$ is one-to-one. Now consider a random vector of moments
$S_n \sim \mathcal{U}(\mathcal{M}_n^{[0,1]})$; then $S_n \in \operatorname{Int} \mathcal{M}_n^{[0,1]}$ almost surely. Dette and Nagel (2010) showed that the
corresponding canonical moments $U_n = \varphi^{(n)}(S_n)$ are independent and that $U_k \in \mathcal{S}_p^+(\mathbb{C})$ follows
a complex matricial distribution $\mathrm{Beta}_p(p(n - k + 1), p(n - k + 1))$, where for $a, b > p - 1$ the
distribution $\mathrm{Beta}_p(a, b)$ has the density (with respect to $dX$)

(2.10)  $B_p(a, b)^{-1} (\det X)^{a-p} (\det(I_p - X))^{b-p}, \qquad 0_p < X < I_p$

(see Khatri (1965) or Pillai and Jouris (1971)). The normalizing constant $B_p(a,b)$ is defined by

(2.11)  $B_p(a,b) := \dfrac{\Gamma_p(a)\Gamma_p(b)}{\Gamma_p(a + b)}, \qquad a, b > p - 1$.

Here $\Gamma_p(a)$ denotes the complex multivariate Gamma function

$\Gamma_p(a) = \pi^{p(p-1)/2} \prod_{i=1}^{p} \Gamma(a - i + 1), \qquad a > p - 1$.

The matricial Beta distribution is one of the three main distributions of complex Hermitian
matrices, together with the Gaussian unitary ensemble $\mathrm{GUE}_p$ having the density

(2.12)  $(2\pi^p)^{-p/2} e^{-\frac{1}{2}\operatorname{tr} X^2}$

and the complex Wishart distribution $W_p(a)$ with density

(2.13)  $\Gamma_p(a)^{-1} (\det X)^{a-p} e^{-\operatorname{tr} X}, \qquad a > p - 1$.

We refer to Mehta (2004) and Forrester (2010) for more on these distributions. The following
result shows that the Wishart distribution and the Gaussian distribution appear as weak limits
of the matricial Beta distribution when the parameters tend to infinity.



Theorem 2.2 Let $(a_n)_n$ be a sequence of positive parameters such that $\lim_{n \to \infty} a_n = \infty$.

(i) If $X_n \sim \mathrm{Beta}_p(a_n, a_n)$, then

$\sqrt{8 a_n}\,(X_n - \tfrac{1}{2} I_p) \xrightarrow{\ \mathcal{D}\ } \mathrm{GUE}_p$.

(ii) Let $c > p - 1$. If $X_n \sim \mathrm{Beta}_p(c, a_n)$, then

$a_n X_n \xrightarrow{\ \mathcal{D}\ } W_p(c)$.

The first statement shows that the centered rescaled canonical moments converge in distribution
to the $\mathrm{GUE}_p$. This is the keystone to obtain a CLT in Dette and Nagel (2010). Notice also that
this implies that the sequence $(X_n)$ converges in probability towards $\tfrac{1}{2} I_p$. The second statement
will play an important role in the study of matrix measures on $\mathbb{T}$.
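For $p = 1$ the matricial Beta distribution is the classical Beta distribution, so both scalings in Theorem 2.2 can be checked on exact moments. The sketch below only uses the textbook mean and variance of a Beta law; it illustrates the normalizations $\sqrt{8 a_n}$ and $a_n$ and is not a proof. The helper names are illustrative.

```python
from fractions import Fraction

def beta_mean(a, b):
    """Mean of the classical Beta(a, b) distribution."""
    return Fraction(a, a + b)

def beta_var(a, b):
    """Variance of the classical Beta(a, b) distribution."""
    return Fraction(a * b, (a + b) ** 2 * (a + b + 1))

def scaled_variance(a):
    """Variance of sqrt(8a) * (X - 1/2) for X ~ Beta(a, a); equals 2a / (2a + 1)."""
    return 8 * a * beta_var(a, a)

def scaled_mean(c, a):
    """Mean of a * X for X ~ Beta(c, a); equals a c / (a + c)."""
    return a * beta_mean(c, a)
```

As the parameter grows, `scaled_variance(a)` tends to 1 (the variance of $\mathrm{GUE}_1$, a standard Gaussian) and `scaled_mean(c, a)` tends to `c` (the mean of $W_1(c)$, a Gamma law with shape `c`).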



2.2 Large deviations

To make this paper self-contained, let us first recall what a LDP is. For more on LDPs we refer to
Dembo and Zeitouni (1998). Let $(u_n)_n$ be an increasing positive sequence of real numbers going
to infinity with $n$.

Definition 2.3 Let $U$ be a Hausdorff topological space and $\mathcal{B}(U)$ its Borel $\sigma$-field. We say that
a sequence $(Q_n)_n$ of probability measures on $(U, \mathcal{B}(U))$ satisfies a LDP with speed $(u_n)$ and rate
function $I : U \to [0, \infty]$ if:

i) $I$ is lower semicontinuous.

ii) For any measurable set $A$ of $U$:

$-I(\operatorname{Int} A) \le \liminf_{n \to \infty} u_n^{-1} \log Q_n(A) \le \limsup_{n \to \infty} u_n^{-1} \log Q_n(A) \le -I(\operatorname{Clo} A)$,

where $I(A) = \inf_{\xi \in A} I(\xi)$ and $\operatorname{Clo} A$ is the closure of $A$.

If we omit the speed, it means that $u_n = n$. We say that the rate function $I$ is good if its
level sets $\{x \in U : I(x) \le a\}$ are compact for any $a \ge 0$. More generally, a sequence of $U$-valued
random variables is said to satisfy a LDP if their distributions satisfy a LDP.

We will need the following well known large deviation result (see e.g. Dembo and Zeitouni
(1998), chapter 4, p. 126 and 130).
Contraction principle. Assume that $(Q_n)_n$ satisfies a LDP on $(U, \mathcal{B}(U))$ with good rate function
$I$ and speed $(u_n)$. Let $T$ be a continuous mapping from $U$ to another Hausdorff topological space
$V$. Then $Q_n \circ T^{-1}$ satisfies a LDP on $(V, \mathcal{B}(V))$ with speed $(u_n)$ and good rate function

$I'(y) = \inf_{x : T(x) = y} I(x), \qquad (y \in V)$.

The so-called cross entropy (or Kullback information) plays an important role in the interpretation
of some of our results. For the sake of completeness we recall its definition.

Kullback Information. Let $P$ and $Q$ be probability distributions on $(U, \mathcal{B}(U))$. The Kullback
information of $P$ with respect to $Q$ is

$\mathcal{K}(P; Q) := \begin{cases} \displaystyle\int_U \log \frac{dP}{dQ} \, dP, & \text{if } P \ll Q \text{ and } \log \frac{dP}{dQ} \in L^1(P), \\ \infty & \text{otherwise.} \end{cases}$
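On a finite set the Kullback information reduces to a finite sum, which makes the two cases of the definition easy to see. The following is a minimal sketch for distributions given as lists of probabilities; the function name `kullback` is an illustrative choice.

```python
import math

def kullback(P, Q):
    """Kullback information K(P; Q) of P with respect to Q on a common finite set.

    P and Q are sequences of probabilities of equal length. Returns math.inf
    when P is not absolutely continuous with respect to Q."""
    total = 0.0
    for p, q in zip(P, Q):
        if p == 0:
            continue              # convention 0 * log 0 = 0
        if q == 0:
            return math.inf       # P charges a point that Q does not
        total += p * math.log(p / q)
    return total
```

For instance, `kullback([1, 0], [0.5, 0.5])` equals $\log 2$, whereas `kullback([0.5, 0.5], [1, 0])` is infinite, which shows that the quantity is not symmetric in its arguments.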

Our first result is a LDP for matricial Beta distributions. For the case where the matrix dimension
tends to infinity, various LDPs can be found in the literature, see for example Hiai and Petz
(2006). Here we are interested in the case of fixed dimension and growing parameters.

Theorem 2.4 Let $a_0, a > 0$ and $c > p - 1$. Further set, for $n \ge 1$, $a_n := a_0 + a n$.

(i) Let $B_n \sim \mathrm{Beta}_p(a_n, a_n)$. Then $B_n$ satisfies a LDP with good rate function

(2.14)  $I^{(1)}(B) = \begin{cases} -a \log\det(B - B^2) - 2ap \log 2, & \text{if } 0_p < B < I_p, \\ \infty & \text{otherwise.} \end{cases}$

(ii) Let $B_n \sim \mathrm{Beta}_p(c, a_n)$. Then $B_n$ satisfies a LDP with good rate function

(2.15)  $I^{(2)}(B) = \begin{cases} -a \log\det(I_p - B), & \text{if } 0_p \le B < I_p, \\ \infty & \text{otherwise.} \end{cases}$

Remark 2.5 For the sake of simplicity we show a LDP only for very special sequences of
parameters. This is enough to obtain our further results. However, the result holds for arbitrary
sequences $a_n \to \infty$.

As a consequence of the last theorem, a LDP holds for the random matricial vector $U_k^{(n)} = (U_1, \dots, U_k)$
of the first $k$ canonical moments associated to a random matricial vector $S_n$ uniformly drawn
from $\mathcal{M}_n^{[0,1]}$. Indeed, as mentioned before, the components of $U_k^{(n)} = (U_1, \dots, U_k)$ are independent, so
that we obtain:
Corollary 2.6 Let $S_n \sim \mathcal{U}(\mathcal{M}_n^{[0,1]})$ and for $k$ fixed, let $U_k^{(n)}$ denote the projection of $U_n =
\varphi^{(n)}(S_n)$ onto the first $k$ coordinates. Then the sequence $(U_k^{(n)})_n$ satisfies a LDP in $(\mathcal{S}_p^+(\mathbb{C}))^k$
with good rate function

(2.16)  $\mathcal{I}_U(U_k) = \begin{cases} -\sum_{i=1}^{k} p \log\det(U_i - U_i^2) - 2kp^2 \log 2, & \text{if } U_k \in (0_p, I_p)^k, \\ \infty & \text{otherwise.} \end{cases}$

Obviously the rate function $\mathcal{I}_U$ achieves its minimum value at $U_k = (\tfrac{1}{2} I_p, \dots, \tfrac{1}{2} I_p)$, which appears
(as discussed before for general sequences of matricial Beta distributed random matrices, see
Theorem 2.2) as the limit of $U_k^{(n)}$. Notice also that the constant infinite sequence $U_k = \tfrac{1}{2} I_p$,
$k \ge 1$, is the canonical moment sequence of the matrix arcsine law $\nu_p$ defined by

(2.17)  $d\nu_1(x) = \dfrac{dx}{\pi\sqrt{x(1 - x)}}, \qquad d\nu_p(x) = d\nu_1(x)\, I_p, \quad (p > 1)$,

see Dette and Nagel (2010).
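The location of the minimum of the rate function of Corollary 2.6 can be checked numerically. The sketch below evaluates $-\sum_{i=1}^k p \log\det(U_i - U_i^2) - 2kp^2\log 2$ with NumPy; the function name `rate_U` is illustrative, and returning infinity as soon as some $U_i$ fails $0_p < U_i < I_p$ is the reading of (2.16) assumed here.

```python
import numpy as np

def rate_U(U_list):
    """Rate function of Corollary 2.6 at a k-tuple of Hermitian p x p matrices."""
    p = U_list[0].shape[0]
    k = len(U_list)
    total = 0.0
    for U in U_list:
        eig = np.linalg.eigvalsh(U)
        if eig.min() <= 0 or eig.max() >= 1:
            return np.inf                       # outside (0_p, I_p)^k
        _, logdet = np.linalg.slogdet(U - U @ U)
        total -= p * logdet
    return total - 2 * k * p**2 * np.log(2)

at_minimum = rate_U([0.5 * np.eye(2)] * 3)      # p = 2, k = 3, all U_i = I_p / 2
elsewhere = rate_U([np.diag([0.3, 0.6])])       # p = 2, k = 1
```

At $U_i = \tfrac12 I_p$ one has $\det(U_i - U_i^2) = 4^{-p}$, so the two terms cancel and the rate vanishes; any other point of $(0_p, I_p)^k$ gives a strictly positive value because $x(1-x) \le 1/4$ on $(0,1)$.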

Now, the vector of ordinary moments $(S_1, \dots, S_k)$ is a continuous function of the canonical
moment vector $U_k^{(n)}$. So we obtain the following corollary from Corollary 2.6 by a simple application
of the contraction principle and the identity

(2.18)  $\det(S_{k+1}^+ - S_{k+1}^-) = \det \prod_{i=1}^{k} U_i (I_p - U_i)$

(see Dette and Studden (2002)).

Corollary 2.7 Let $S_n \sim \mathcal{U}(\mathcal{M}_n^{[0,1]})$ and for $k < n$ let $S_k^{(n)}$ denote the projection of $S_n$ onto the
first $k$ coordinates. Then $S_k^{(n)}$ satisfies a LDP with good rate function

(2.19)  $\mathcal{I}_S(S_k) = \begin{cases} -p \log\det(S_{k+1}^+ - S_{k+1}^-) - 2kp^2 \log 2, & \text{if } S_k \in \operatorname{Int} \mathcal{M}_k^{[0,1]}, \\ \infty & \text{otherwise.} \end{cases}$

We end this section with a LDP for random matrix measures on $[0, 1]$. For this purpose, for every
$n$ let $P_n$ denote any probability measure on $\mathcal{P}([0, 1])$ such that the pushforward by the mapping

$\mu \in \mathcal{P}([0, 1]) \mapsto S_n(\mu) = (S_1(\mu), \dots, S_n(\mu)) \in \mathcal{M}_n^{[0,1]}$

is $\mathcal{U}(\mathcal{M}_n^{[0,1]})$.
Theorem 2.8 The sequence $(P_n)_n$ satisfies a LDP in $\mathcal{M}_1(\mathcal{P}([0, 1]))$ with good rate function

(2.20)  $\mathcal{I}_{[0,1]}(\mu) = \begin{cases} -p \displaystyle\int_0^1 \log\det W(x) \, d\nu_1(x), & \text{if } \nu_1\{\det W = 0\} = 0, \\ \infty & \text{otherwise,} \end{cases}$

where $d\mu(x) = W(x)\,d\nu_p(x) + d\mu_s(x)$ is the Lebesgue decomposition¹ of $\mu$ with respect to $\nu_p$ as
matricial measures on $[0, 1]$ ($\nu_1$ and $\nu_p$ are the arcsine measures defined by (2.17)).

Remark 2.9 1. When $p = 1$ (scalar case) the rate function is also

(2.21)  $\mathcal{I}_{[0,1]}(\mu) = \mathcal{K}(\nu_1; \mu)$.

The matricial case also has an interpretation in terms of cross entropy, which we hope to
address in a future work.

2. A cousin result of Theorem 2.8 holds in the frame of real matrix measures. In this case
the constant p in the rate function is repl a ced by All arguments remain essentially 



unchanged and we refer to Dette and Nagel (2010) for the underlying results on real matrix
valued random moments and the corresponding canonical moments.

3. From Theorem 2.8 and Corollary 2.7 together with the contraction principle one easily
obtains the following identity of rate functions. For $S_k = (S_1, \dots, S_k) \in \operatorname{Int} \mathcal{M}_k^{[0,1]}$ we have

(2.22)  $\mathcal{I}_S(S_k) = -p \log\det(S_{k+1}^+ - S_{k+1}^-) - 2kp^2 \log 2 = \inf_{\mu \in \mathcal{P}(S_k)} \left\{ -p \displaystyle\int_0^1 \log\det W(x) \, d\nu_1(x) \right\}$,

where

(2.23)  $\mathcal{P}(S_k) = \left\{ \mu \in \mathcal{P}([0, 1]) \;\middle|\; \int_0^1 x^j \, d\mu(x) = S_j, \ j = 1, \dots, k \right\}$

and $W$ is defined as in Theorem 2.8.

¹ See Robertson and Rosenberg (1968) on Lebesgue decomposition for matricial measures.
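3 Matrix measures on $\mathbb{T}$: the trigonometric case

In this section, we consider the space $\mathcal{P}(\mathbb{T})$ of matrix-valued probability measures on the unit circle
$\mathbb{T}$. In what follows $\Gamma_j$ denotes the $j$-th trigonometric moment of a matrix measure $\mu \in \mathcal{P}(\mathbb{T})$,
that is

(3.1)  $\Gamma_j = \Gamma_j(\mu) = \int_{-\pi}^{\pi} e^{ij\theta} \, d\mu(\theta)$,

and for $n \in \mathbb{N}$ and $p \ge 1$ the set defined in (1.1) is

(3.2)  $\mathcal{M}_n^{\mathbb{T}} := \left\{ (\Gamma_1, \dots, \Gamma_n) \;\middle|\; \Gamma_j = \Gamma_j(\mu), \ \mu \in \mathcal{P}(\mathbb{T}) \right\} \subset (\mathbb{C}^{p \times p})^n$.

Unlike the moments of matrix measures on $[0, 1]$, the moment $\Gamma_j$ is no longer Hermitian. Therefore
we use the following Lebesgue measure on $\mathbb{C}^{p \times p}$. For $X = (x_{ij}) \in \mathbb{C}^{p \times p}$ define

(3.3)  $dX = \prod_{1 \le i,j \le p} dx_{ij}^{R} \, dx_{ij}^{I}$.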

3.1 Canonical moments on $\mathbb{T}$

As in the above section we use a notion of canonical moments to study $\mathcal{M}_n^{\mathbb{T}}$. First, for
$(\Gamma_1, \dots, \Gamma_n) \in \mathcal{M}_n^{\mathbb{T}}$, we build the block Toeplitz matrix

(3.4)  $T_n := (\Gamma_{j-i})_{i,j=0,\dots,n}$,

where, by (3.1), $\Gamma_0 = I_p$ and $\Gamma_{-j} = \Gamma_j^*$. Dette and Wagener (2010) showed that
$(\Gamma_1, \dots, \Gamma_n) \in \operatorname{Int} \mathcal{M}_n^{\mathbb{T}}$ if and only if $T_n > 0$. Therefore
this interior is nonempty. Furthermore they proved that for $(\Gamma_1, \dots, \Gamma_n) \in \operatorname{Int} \mathcal{M}_n^{\mathbb{T}}$ the range of
the moment $\Gamma_{n+1}$ is the set

(3.5)  $K_n = \{ W \in \mathbb{C}^{p \times p} \mid L_n^{-1/2} (W - M_n) R_n^{-1/2} = U, \ UU^* \le I_p \}$,

where the matrices $L_n$, $R_n$ and $M_n$ are defined by

(3.6)  $L_n := I_p - (\Gamma_1, \dots, \Gamma_n)\, T_{n-1}^{-1}\, (\Gamma_1, \dots, \Gamma_n)^*$,

(3.7)  $R_n := I_p - (\Gamma_{-n}, \dots, \Gamma_{-1})\, T_{n-1}^{-1}\, (\Gamma_{-n}, \dots, \Gamma_{-1})^*$,

(3.8)  $M_n := (\Gamma_1, \dots, \Gamma_n)\, T_{n-1}^{-1}\, (\Gamma_{-n}, \dots, \Gamma_{-1})^*$,

respectively. In this frame, canonical moments are defined by normalizing the moments in the
following way.
Definition 3.1 For $(\Gamma_1, \dots, \Gamma_n) \in \operatorname{Int} \mathcal{M}_n^{\mathbb{T}}$ we define the canonical moments $A_j$, $j = 1, \dots, n$,
by setting

(3.9)  $A_1 := \Gamma_1, \qquad A_j := L_{j-1}^{-1/2} (\Gamma_j - M_{j-1}) R_{j-1}^{-1/2} \quad (j = 2, \dots, n)$.

The canonical moments of a matrix measure always lie in the set

(3.10)  $\mathbb{D}_p = \{ U \in \mathbb{C}^{p \times p} \mid UU^* \le I_p \}$

and coincide with the well known Verblunsky coefficients appearing in the Szegő recursion of
orthonormal matrix polynomials (see e.g. Simon (2005), Section 2.13). They are connected to the
trigonometric moments by a one-to-one mapping $\psi^{(n)} : \operatorname{Int} \mathcal{M}_n^{\mathbb{T}} \to \operatorname{Int} \mathbb{D}_p^n$ recursively defined by
Definition 3.1.
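For $n = 2$ the definition of the canonical moments can be made fully explicit. The sketch below assumes the conventions $\Gamma_0 = I_p$ and $\Gamma_{-1} = \Gamma_1^*$, so that $L_1 = I_p - \Gamma_1\Gamma_1^*$, $R_1 = I_p - \Gamma_1^*\Gamma_1$ and $M_1 = \Gamma_1^2$; these specializations, the helper names, and the example measure are assumptions made for illustration.

```python
import numpy as np

def inv_sqrt(H):
    """Inverse square root of a positive definite Hermitian matrix."""
    w, V = np.linalg.eigh(H)
    return (V / np.sqrt(w)) @ V.conj().T

def first_two_canonical_moments(G1, G2):
    """A_1 and A_2 from the trigonometric moments Gamma_1 and Gamma_2.

    Assumes Gamma_0 = I_p and Gamma_{-1} = Gamma_1^*, which gives
    L_1 = I - G1 G1^*, R_1 = I - G1^* G1 and M_1 = G1 G1."""
    I = np.eye(G1.shape[0])
    L1 = I - G1 @ G1.conj().T
    R1 = I - G1.conj().T @ G1
    M1 = G1 @ G1
    A1 = G1
    A2 = inv_sqrt(L1) @ (G2 - M1) @ inv_sqrt(R1)
    return A1, A2

# Hypothetical example: half normalized Lebesgue measure plus half a point mass
# at theta = 0, both times I_p; then Gamma_j = I_p / 2 for every j >= 1.
p = 2
A1, A2 = first_two_canonical_moments(0.5 * np.eye(p), 0.5 * np.eye(p))
```

In this example $A_1 = \tfrac12 I_p$ and $A_2 = (\tfrac12 - \tfrac14)/(\tfrac34)\, I_p = \tfrac13 I_p$, both strict contractions, as expected for an interior point.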

We now state a Taylor expansion of the inverse of the mapping $\psi^{(n)}$. Here and in the following,
$\|M\|$ always denotes the Frobenius norm of the matrix $M$ with complex entries, that is

$\|M\| := \left( \operatorname{tr}(M^* M) \right)^{1/2}$.

Lemma 3.2 Let $n \in \mathbb{N}_+$ and $A_n = (A_1, \dots, A_n) \in \operatorname{Int} \mathbb{D}_p^n$. The mapping $(\psi^{(n)})^{-1} : A_n \mapsto X_n =
(\Gamma_1, \dots, \Gamma_n)$ induced by the definition of canonical moments has an order one Taylor expansion
at 0. Namely,

(3.11)  $X_n = A_n + o(\|A_n\|)$.

In the following this Taylor expansion will be used to derive results concerning trigonometric
moments from results obtained for canonical moments.



3.2 Weak convergence in the trigonometric case

As in the real case we define a uniform distribution $\mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$ on $\mathcal{M}_n^{\mathbb{T}}$ by the density

(3.12)  $\left( \int_{\mathcal{M}_n^{\mathbb{T}}} d\Gamma_1 \cdots d\Gamma_n \right)^{-1} \mathbb{1}\{ X_n \in \mathcal{M}_n^{\mathbb{T}} \}$,

now with respect to the measure (3.3). We first state a result on the distribution of the canonical
moments when the corresponding trigonometric moments are uniformly distributed.
Lemma 3.3 Let $X_n \sim \mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$ and let $A_n = (A_1, \dots, A_n) = \psi^{(n)}(X_n) \in (\mathbb{D}_p)^n$ denote the
corresponding vector of canonical moments. Then $A_1, \dots, A_n$ are independent and for $k = 1, \dots, n$,
$A_k$ has density

(3.13)  $\dfrac{1}{c_k^{(n)}} \det(I_p - A_k^* A_k)^{2p(n-k)}$

with respect to (3.3), where $c_k^{(n)}$ is a normalizing constant.

We now establish a relation between the Hermitian random matrices from Section 2 and matricial
random variables without symmetry condition:

Theorem 3.4 If $A_k$ is a random matrix with density (3.13), then

(3.14)  $A_k \overset{\mathcal{D}}{=} V B_k^{1/2}$

where $V$ and $B_k$ are independent, $V$ is Haar distributed in $\mathbb{U}(p)$ and $B_k$ follows a multivariate
complex Beta distribution $\mathrm{Beta}_p(p, 2p(n - k) + p)$ (see (2.10)).

The previous theorem is a particular case of the following general change of variables result. It is
quite natural and useful in other asymptotic problems involving random complex matrices.
Similar arguments have been used recently by Fischmann et al. (2011) to generate matrices of
the Ginibre ensemble.

Proposition 3.5 Let $M$ be a $p \times p$ random matrix with complex entries whose density with respect
to (3.3) is $f(\chi_1(M), \dots, \chi_p(M))$, where $\chi_1(M), \dots, \chi_p(M)$ are the (positive) singular values and
$f$ is a symmetric function. Then the random matrices $H = M^* M$ and $U = M (M^* M)^{-1/2}$ are
independent, $U$ is Haar distributed in $\mathbb{U}(p)$ and the density of $H \in \mathcal{S}_p^+(\mathbb{C})$ with respect to (2.3)
is proportional to $f(\lambda_1(H), \dots, \lambda_p(H))$, where $\lambda_1(H), \dots, \lambda_p(H)$ are the eigenvalues of $H$.
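The algebra behind this polar decomposition is easy to verify numerically. The sketch below draws one matrix with independent standard complex Gaussian entries (density proportional to $\exp(-\|M\|^2)$) and takes the unitary factor as $U = M (M^*M)^{-1/2}$, the convention matching $A_k = V B_k^{1/2}$ in Theorem 3.4. It checks only the decomposition, not the independence or the Haar property; the seed and dimension are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3
# Complex Gaussian matrix: real and imaginary parts of each entry have variance 1/2
M = (rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))) / np.sqrt(2)

H = M.conj().T @ M                           # Hermitian positive definite part
w, V = np.linalg.eigh(H)                     # eigenvalues in ascending order
H_sqrt = (V * np.sqrt(w)) @ V.conj().T       # H^{1/2}
H_inv_sqrt = (V / np.sqrt(w)) @ V.conj().T   # H^{-1/2}
U = M @ H_inv_sqrt                           # unitary polar factor, M = U H^{1/2}

singular_values = np.linalg.svd(M, compute_uv=False)
```

The matrix `U` is unitary, `U @ H_sqrt` recovers `M`, and the eigenvalues of `H` are the squared singular values of `M`.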

We are now in a position to give our first limit theorem in the trigonometric case.

Theorem 3.6 Let $X_n \sim \mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$, $A_n = \psi^{(n)}(X_n)$ and let $A_k^{(n)}$ denote the projection onto the first $k$
coordinates ($k$ is fixed). Then for $n \to \infty$ the weak convergence

(3.15)  $\sqrt{2pn}\, A_k^{(n)} \xrightarrow{\ \mathcal{D}\ } \mathcal{G}_k$

holds, where $\mathcal{G}_k = (G_1, \dots, G_k)$ and $G_1, \dots, G_k$ are complex iid random matrices of the Ginibre
complex ensemble (see Ginibre (1965)) or, in other words, having density

(3.16)  $s(G) = \pi^{-p^2} \exp(-\|G\|^2)$

with respect to (3.3).
As a consequence, using the Taylor expansion of Lemma 3.2 and the $\delta$-method (see for example
van der Vaart (1998)), we obtain a weak convergence theorem for the rescaled random trigonometric
moments. This is the subject of the next corollary.

Corollary 3.7 Let $X_n \sim \mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$ and let $X_k^{(n)}$ denote the projection onto the first $k$ coordinates ($k$ is
fixed). Then when $n \to \infty$

(3.17)  $\sqrt{2pn}\, X_k^{(n)} \xrightarrow{\ \mathcal{D}\ } \mathcal{G}_k$

(here $\mathcal{G}_k$ is as in Theorem 3.6).

3.3 Large deviations in the trigonometric case

Our final results concern LDPs for random moments and matrix measures on the unit circle. The
large deviations in the scalar trigonometric case are due to Lozada-Chang (2005), Theorems 4.2
and 4.4. Nevertheless, in that paper, there was a mistake in the computation of the Jacobian: a
power 2 is missing.

The proof of the next corollary follows directly from part (ii) of Theorem 2.4 (applying the
contraction principle). We again use the equality $A_k = V B_k^{1/2}$, where $B_k \sim \mathrm{Beta}_p(p, 2p(n - k) + p)$
and $V$ is Haar distributed on the unitary group. By Lemma 3.3 the canonical moments are
independent, giving the final form of the rate function.

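Corollary 3.8 Let $X_n \sim \mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$, $A_n = \psi^{(n)}(X_n)$ and let $A_k^{(n)}$ denote the projection onto the first $k$
coordinates ($k$ is fixed). Then $A_k^{(n)}$ satisfies a LDP with good rate function

(3.18)  $\mathcal{I}_A(Z) = \mathcal{I}_A(Z_1, \dots, Z_k) = \begin{cases} -2p \sum_{i=1}^{k} \log\det(I_p - Z_i^* Z_i), & \text{if } Z \in \operatorname{Int} \mathbb{D}_p^k, \\ \infty & \text{otherwise.} \end{cases}$

Another application of the contraction principle for the mapping $(\psi^{(k)})^{-1}$ yields the following LDP
for the trigonometric moments.

Corollary 3.9 Let $X_n \sim \mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$ and let $X_k^{(n)}$ denote the projection onto the first $k$ coordinates ($k$ is
fixed). Then $X_k^{(n)}$ satisfies a LDP with good rate function

(3.19)  $\mathcal{I}_\Gamma(X) = \mathcal{I}_\Gamma(\Gamma_1, \dots, \Gamma_k) = \begin{cases} -2p \log \dfrac{\det(T_k)}{\det(T_{k-1})}, & \text{if } X \in \operatorname{Int} \mathcal{M}_k^{\mathbb{T}}, \\ \infty & \text{otherwise.} \end{cases}$

Here, $T_k$ denotes the block Toeplitz matrix (3.4) defined by $(\Gamma_1, \dots, \Gamma_k)$.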

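Finally we state a LDP for a sequence of random matrix measures on $\mathbb{T}$. For every $n$, let $Q_n$
denote a probability measure on the set $\mathcal{P}(\mathbb{T})$ such that the pushforward by the mapping

$\mu \in \mathcal{P}(\mathbb{T}) \mapsto X_n(\mu) = (\Gamma_1(\mu), \dots, \Gamma_n(\mu)) \in \mathcal{M}_n^{\mathbb{T}}$

is $\mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$.

Theorem 3.10 The sequence $(Q_n)_n$ satisfies a LDP in $\mathcal{M}_1(\mathcal{P}(\mathbb{T}))$ with good rate function

(3.20)  $\mathcal{I}_{\mathbb{T}}(\mu) = \begin{cases} -\dfrac{p}{\pi} \displaystyle\int_{\mathbb{T}} \log\det(W(\theta)) \, d\theta, & \text{if } \det W(\theta) \ne 0 \text{ a.e.,} \\ \infty & \text{otherwise,} \end{cases}$

where $d\mu(\theta) = W(\theta) \frac{d\theta}{2\pi} + d\mu_s(\theta)$ is the Lebesgue decomposition of $\mu$ with respect to $\frac{d\theta}{2\pi} I_p$ as matricial
measures on $\mathbb{T}$.

The proof is very similar to that of Theorem 2.8 and is therefore omitted.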



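Remark 3.11 1. For $p = 1$ the rate function is also

(3.21)  $\mathcal{I}_{\mathbb{T}}(\mu) = 2\, \mathcal{K}\!\left( \dfrac{d\theta}{2\pi} ; \mu \right)$.

It is the content of Theorem 4.4 in Lozada-Chang (2005), but a factor 2 was missing in that
paper, owing to a mistake in the Jacobian (7.2).

2. As in Remark 2.9 we see, from Theorem 3.10 and Corollary 3.9 together with the contraction
principle, the following identity of rate functions. For $X_k = (\Gamma_1, \dots, \Gamma_k) \in \operatorname{Int} \mathcal{M}_k^{\mathbb{T}}$ we have

(3.22)  $\mathcal{I}_\Gamma(X_k) = -2p \log \dfrac{\det(T_k)}{\det(T_{k-1})} = \inf_{\mu \in \mathcal{C}(X_k)} \left\{ -\dfrac{p}{\pi} \displaystyle\int_{\mathbb{T}} \log\det(W(\theta)) \, d\theta \right\}$,

where

(3.23)  $\mathcal{C}(X_k) = \left\{ \mu \in \mathcal{P}(\mathbb{T}) \;\middle|\; \int_{\mathbb{T}} e^{ij\theta} \, d\mu(\theta) = \Gamma_j, \ j = 1, \dots, k \right\}$

and $W$ is defined as in Theorem 3.10.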
4 Application: Random Carathéodory and Schur matrix functions

In Theorem 3.10 above we studied a family of random measures. Since the truncated
trigonometric moment problem is closely connected to the Carathéodory problem, which is
itself connected to the Schur problem, it is natural to look at the corresponding random
functions. In this section we study the impact of uniform sampling on the space of Taylor
coefficients of these functions. We first give the framework, which can be found in Damanik et al.
(2008) or Dubovoj et al. (1992), and then we give our results. They seem to be new, even in the
scalar case.



4.1 Carathéodory and Schur matrix-valued functions

As before, let $p$ be a given positive integer. By a $\mathbb{C}^{p \times p}$-valued Carathéodory matrix function
$F(z)$, one means a $p \times p$ matrix-valued function which is holomorphic in $\mathbb{D}$, has a nonnegative
real part there,

$\operatorname{Re} F(z) = \tfrac{1}{2}(F(z) + F(z)^*) \ge 0, \qquad z \in \mathbb{D}$,

and such that $F(0) = I_p$. We use the notation $\mathcal{C}_p$ to designate the class of such $\mathbb{C}^{p \times p}$-valued
Carathéodory matrix functions. We also define the class $\mathfrak{S}_p$ of $\mathbb{C}^{p \times p}$-valued functions $f$
analytic in $\mathbb{D}$ and contractive there, i.e. such that $f(z) \in \mathbb{D}_p$ for $z \in \mathbb{D}$, which are called matrix
valued Schur functions.

The correspondence

(4.1)  $F(z) = (I_p + z f(z))(I_p - z f(z))^{-1}$ and $f(z) = z^{-1}(F(z) - I_p)(F(z) + I_p)^{-1}$

is one-to-one between $\mathcal{C}_p$ and $\mathfrak{S}_p$. Any $F \in \mathcal{C}_p$ has a representation

$F(z) = \int_{\mathbb{T}} \dfrac{e^{i\theta} + z}{e^{i\theta} - z} \, d\mu(\theta), \qquad z \in \mathbb{D}$,

for a unique $\mu \in \mathcal{P}(\mathbb{T})$. Any $F \in \mathcal{C}_p$ has a finite radial limit $\lim_{r \uparrow 1} F(re^{i\theta}) =: F(e^{i\theta})$ for almost
every $\theta$. The corresponding value of $f$ at such a point $e^{i\theta}$ will be denoted by $f(e^{i\theta})$. If

$d\mu(\theta) = W(\theta) \dfrac{d\theta}{2\pi} + d\mu_s(\theta)$

is the Lebesgue decomposition of $\mu$ one has the identity

(4.2)  $W(\theta) = \operatorname{Re} F(e^{i\theta}) = (I_p - e^{-i\theta} f(e^{i\theta})^*)^{-1} (I_p - f(e^{i\theta})^* f(e^{i\theta})) (I_p - e^{i\theta} f(e^{i\theta}))^{-1}$
a.e., and for a.e. $\theta$, $\det W(\theta) \ne 0$ iff $f(e^{i\theta})^* f(e^{i\theta}) < I_p$ (Prop. 3.16 in Damanik et al. (2008)).
The Taylor expansion of $F$ is given by

$F(z) = I_p + 2 \sum_{k=1}^{\infty} C_k(F) z^k$,

where the coefficients are the conjugate trigonometric moments of the matrix measure $\mu$ associated
to $F$, i.e.

$C_k(F) = \int_{\mathbb{T}} e^{-ik\theta} \, d\mu(\theta) = \Gamma_k^*$.

The classical Carathéodory problem is to find $F \in \mathcal{C}_p$ such that the first $n$ Taylor coefficients
coincide with given $p \times p$ matrices $C_1, \dots, C_n$. It is clearly equivalent to the truncated moment
problem.

Each Schur function in $\mathcal{S}_p$ is associated to a matrix measure $\mu \in \mathcal{P}(\mathbb{T})$, hence to the sequence of its canonical moments $(A_k)_{k \geq 1}$. For every $j \geq 1$, let $f_j$ be the Schur function corresponding to the shifted sequence $(A_k)_{k \geq j+1}$, and set $f_0 = f$. From Theorem 3.19 of Damanik et al. (2008) we have the recursive relations:

(4.3) $f_k(z) = z^{-1} (B_k^R)^{-1} [f_{k-1}(z) - A_k^*] [I_p - A_k f_{k-1}(z)]^{-1} B_k^L,$
(4.4) $f_k(z) = (B_{k+1}^R)^{-1} [z f_{k+1}(z) + A_{k+1}^*] [I_p + z A_{k+1} f_{k+1}(z)]^{-1} B_{k+1}^L,$

where

(4.5) $B_k^R := [I_p - A_k^* A_k]^{1/2}, \quad B_k^L := [I_p - A_k A_k^*]^{1/2}.$
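As an illustration of the Schur recursions, here is a minimal sketch in the scalar case $p = 1$, where the factors $B_k^R$ and $B_k^L$ coincide and cancel. It assumes a finite list of coefficients and the terminal condition $f_n = 0$, which is a simplifying assumption made only for this example; the function names are hypothetical.

```python
def inverse_step(a_next, f_next, z):
    # Scalar inverse step: f_k(z) = (z f_{k+1}(z) + conj(A_{k+1})) / (1 + z A_{k+1} f_{k+1}(z)).
    return (z * f_next + a_next.conjugate()) / (1 + z * a_next * f_next)

def forward_step(a, f_prev, z):
    # Scalar forward step: f_k(z) = z^{-1} (f_{k-1}(z) - conj(A_k)) / (1 - A_k f_{k-1}(z)), z != 0.
    return (f_prev - a.conjugate()) / (z * (1 - a * f_prev))

def schur_iterates(coeffs, z):
    # Values f_0(z), ..., f_n(z) for coeffs = [A_1, ..., A_n], assuming f_n = 0.
    values = [0j]
    for a in reversed(coeffs):
        values.append(inverse_step(a, values[-1], z))
    values.reverse()  # values[k] = f_k(z)
    return values
```

Applying `forward_step` to `values[k-1]` should return `values[k]`, and evaluating at `z = 0` should give $f(0) = \overline{A_1}$, the scalar form of $G_0(f) = A_1^*$.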

The Taylor expansion of $f$ is

(4.6) $f(z) = \sum_{k=0}^{\infty} G_k(f) z^k.$

The Schur problem is to find a Schur function $f \in \mathcal{S}_p$ such that the first $n$ Taylor coefficients coincide with given $p \times p$ matrices $G_0, \dots, G_{n-1}$. A solution exists if and only if the block matrix



$G = \begin{pmatrix} G_0 & 0 & \cdots & \cdots & 0 \\ G_1 & G_0 & 0 & & \vdots \\ G_2 & G_1 & G_0 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ G_{n-1} & G_{n-2} & G_{n-3} & \cdots & G_0 \end{pmatrix}$

is contractive, i.e. if it satisfies $GG^* \leq I_{np}$ (see Dubovoj et al. (1992), Theorem 3.1.1). The set

$\{(G_0(f), \dots, G_{n-1}(f)) ;\ f \in \mathcal{S}_p\}$

is a relatively compact subset of $(\mathbb{C}^{p\times p})^n$.

In both problems, the system of canonical moments (alias Verblunsky coefficients, alias Schur coefficients) plays a prominent role. In Section 3.3 we saw that the dependence between the moments (hence the $C_k$'s) and the canonical moments is triangular. The relation between the Taylor coefficients of a Schur function and its Schur coefficients (i.e. the canonical moments of the associated measure) is also triangular. We postpone the presentation of this point to the proof of Theorem 4.1.

4.2 Randomization. Large deviations 

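For every $n$ let $P_n^{C}$ denote a probability measure on the set $\mathcal{C}_p$ such that the pushforward by the mapping

$F \in \mathcal{C}_p \mapsto \mathbf{C}_n(F) = (C_1(F), \dots, C_n(F)) \in \mathcal{M}_n^{\mathbb{T}}$

is $\mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$. Let also $P_n^{S}$ denote a probability measure on the set $\mathcal{S}_p$ such that the pushforward by the mapping

$f \in \mathcal{S}_p \mapsto \mathbf{G}_n(f) := (G_0(f), \dots, G_{n-1}(f))$

is the uniform distribution on the set $\{\mathbf{G}_n(f) ;\ f \in \mathcal{S}_p\}$ of all such coefficient vectors.

One gets the following LDP for matrix valued Caratheodory and Schur functions.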



Theorem 4.1 The sequence $(P_n^{C})_n$ satisfies a LDP in $\mathcal{M}_1(\mathcal{C}_p)$ with good rate function



(4.7) 





otherwise. 



) ^ a.e., 



The sequence $(P_n^{S})_n$ satisfies a LDP in $\mathcal{M}_1(\mathcal{S}_p)$ with good rate function





otherwise. 






Remark 4.2 Behind Theorem 3.10 and Theorem 4.1 (and as will be seen in the proofs), there is a triple identity, which holds true in the generic case:

$\sum_{k \geq 1} \log\det(I_p - A_k^* A_k) = \int_{\mathbb{T}} \log\det W(\theta)\, \frac{d\theta}{2\pi} = \int_{\mathbb{T}} \log\det F_R(e^{i\theta})\, \frac{d\theta}{2\pi}$
(4.9) $\qquad = \int_{\mathbb{T}} \log\det(I_p - f(e^{i\theta})^* f(e^{i\theta}))\, \frac{d\theta}{2\pi},$

say

(1) = (2) = (3) = (4).

Equality (1) = (2) is Szegő's Theorem for matrix-valued measures (see Theorem 2.13.5 in Simon (2005)), and (1) = (4) is the matricial version of Boyd's theorem (see 2.7.7 of Simon (2005) in the scalar case).



5 Proofs 

5 . 1 Proof of Theorem Q 

If $X$ is $\mathrm{Beta}_p(\alpha, \beta)$ distributed, then

$X \overset{(d)}{=} (W_1 + W_2)^{-1/2}\, W_1\, (W_1 + W_2)^{-1/2},$

where $W_1 \sim W_p(\alpha)$ and $W_2 \sim W_p(\beta)$ are independent and Wishart distributed.
For (i), we choose $\alpha = \beta = a_n$ and observe that

$X_n - \tfrac{1}{2} I_p = \tfrac{1}{2} (W_1 + W_2)^{-1/2} \left[(W_1 - a_n I_p) + (a_n I_p - W_2)\right] (W_1 + W_2)^{-1/2};$

then we apply Proposition 6.1 (i) and (ii).

For (ii), it is enough to take $\alpha = c$ and $\beta = a_n$ and apply Proposition 6.1 (i). □

5.2 Proof of Theorem 2.4

We give a proof only for $a_n = an$.

To prove (i) let $B_n \sim \mathrm{Beta}_p(an, an)$; then again the following equality in distribution holds

$B_n \overset{(d)}{=} \left(\sum_{i=1}^{2n} W_i\right)^{-1/2} \left(\sum_{i=1}^{n} W_i\right) \left(\sum_{i=1}^{2n} W_i\right)^{-1/2},$

where the random variables $W_i$ are independent and $W_p(a)$ distributed (see e.g. Pillai and Jouris (1971)). By Proposition 6.2 each component $V_n^{(1)}, V_n^{(2)}$ of the vector

$\left(V_n^{(1)}, V_n^{(2)}\right) = \left(\frac{1}{n}\sum_{i=1}^{n} W_i,\ \frac{1}{n}\sum_{i=n+1}^{2n} W_i\right)$

satisfies a LDP with good rate function $\Lambda^*$ given by (6.2).

The independence of the random variables $W_i$ now yields a LDP for $(V_n^{(1)}, V_n^{(2)})$ with good rate function $\Lambda^*(X) + \Lambda^*(Y)$. By the contraction principle and equality (5.1), the random variable $B_n$ satisfies a LDP on $(0_p, I_p)$ with good rate function

$\mathcal{I}(Z) = \inf \left(\Lambda^*(X) + \Lambda^*(Y)\right)$
$\qquad = \inf \left(\operatorname{tr}(X + Y) - a \log\det(XY) - 2pa + 2pa \log a\right),$

where the infimum is taken over the set

$\mathcal{Z} = \{(X, Y) \in \mathcal{S}_p^+(\mathbb{C})^2 \mid Z = (X + Y)^{-1/2} X (X + Y)^{-1/2}\}.$

On $\mathcal{Z}$ we have $\det(XY) = \det(Z(I_p - Z)) \det(X + Y)^2$ and we can write the rate function as

$\mathcal{I}(Z) = -a \log\det(Z(I_p - Z)) - 2pa + 2pa \log a + \inf \left(\operatorname{tr}(X + Y) - 2a \log\det(X + Y)\right).$

Appealing to (6.4) with $L = (2a)^{-1}(X + Y)$, we see that

$\mathcal{I}(Z) = -a \log\det(Z(I_p - Z)) - 2pa \log 2.$
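The infimum in part (i) can be sanity-checked numerically in the scalar case $p = 1$, where the constraint set is parametrised by $s = X + Y > 0$ with $X = zs$ and $Y = (1 - z)s$. The sketch below is illustrative only: the grid search, its bounds and the function names are assumptions made for this example, not part of the proof.

```python
import math

def objective(s, z, a):
    # Scalar Lambda*(X) + Lambda*(Y) with X = z s and Y = (1 - z) s.
    x, y = z * s, (1 - z) * s
    return x + y - a * math.log(x * y) - 2 * a + 2 * a * math.log(a)

def rate_by_grid(z, a, steps=20000, s_max=50.0):
    # Brute-force infimum over s = X + Y on a grid of (0, s_max].
    return min(objective(s_max * (i + 1) / steps, z, a) for i in range(steps))

def rate_closed_form(z, a):
    # Closed form obtained in the text for p = 1: -a log(z (1 - z)) - 2 a log 2.
    return -a * math.log(z * (1 - z)) - 2 * a * math.log(2)
```

The minimiser is $s = 2a$, in agreement with the choice $L = (2a)^{-1}(X + Y) = I_p$, and the closed form vanishes at $z = 1/2$.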

To prove (ii) let $B_n \sim \mathrm{Beta}_p(c, an)$. Then we have

$B_n \overset{(d)}{=} \left(X + \sum_{i=1}^{n} W_i\right)^{-1/2} X \left(X + \sum_{i=1}^{n} W_i\right)^{-1/2},$

where $X \sim W_p(c)$, $(W_i)_{i=1,\dots,n}$ are iid $W_p(a)$ distributed and $X$ and $(W_i)_{i=1,\dots,n}$ are independent.
By Propositions 6.2 and 6.3 we get for $\left(\frac{1}{n} X, \frac{1}{n}\sum_{i=1}^{n} W_i\right)$ a LDP with rate function the sum of rate functions and, by the contraction principle, we get a LDP with rate function

$\mathcal{I}(Z) = \inf \left(\operatorname{tr} X + \operatorname{tr} Y - a \log\det Y - ap + ap \log a\right),$

where $\mathcal{Z}$ is as in the proof of Theorem 2.4 (i). On $\mathcal{Z}$ we have $\det(Y) = \det(X + Y) \det(I_p - Z)$, hence

$\operatorname{tr} X + \operatorname{tr} Y - a \log\det Y = \operatorname{tr}(X + Y) - a \log\det(X + Y) - a \log\det(I_p - Z)$

and the infimum is achieved for $X + Y = a I_p$ by (6.4). This completes the proof. □

5.3 Proof of Theorem Q



We follow here the proof given in Gamboa and Lozada-Chang (2004) concerning the scalar case.
Let $P_n$ be the probability measure on the infinite dimensional moment space

$\mathcal{M}^{\infty} = \left\{ S = (S_1, S_2, \dots) \ \middle|\ S_j = \int x^j\, d\mu(x),\ \mu \in \mathcal{P}([0,1]) \right\}$

induced by the bijection $S \mapsto \mu_S$. Now if $\Pi_k^{\infty}$ denotes the canonical projection $\mathcal{M}^{\infty} \to \mathcal{M}_k^{[0,1]}$, then the measure $P_n \circ (\Pi_k^{\infty})^{-1}$ is the law of $S_k^{(n)}$. Therefore, Corollary O yields a LDP for the sequence $\left(P_n \circ (\Pi_k^{\infty})^{-1}\right)_n$ with speed $n$ and good rate function

$\mathcal{I}_k(S_k) = -p \log\det(S_{k+1}^+ - S_{k+1}^-) - 2kp^2 \log 2.$

By the Dawson-Gärtner Theorem (see Dembo and Zeitouni (1998)) the sequence $P_n$ satisfies a LDP with good rate function

$\mathcal{I}(S) = \sup_{k \in \mathbb{N}} \mathcal{I}_k(S_k).$

It remains to calculate the right hand side of the last equality, which is given by

$\sup_{k \in \mathbb{N}}\ -p \log\left(4^{pk} \det(S_{k+1}^+ - S_{k+1}^-)\right).$

fceN 

Let $\mu$ denote a matrix measure corresponding to the sequence $S$ and let $\tilde\mu$ denote the image measure on $[-1,1]$ obtained from $\mu$ by the affine transformation $x \mapsto 2(x - \frac{1}{2})$. Since canonical moments are invariant under affine transformations, i.e., $U_i(\mu) = U_i(\tilde\mu)$ (see for example Dette and Nagel (2010), Lemma 3.1), we have

$\det(S_{k+1}^+(\mu) - S_{k+1}^-(\mu)) = \prod_{i=1}^{k} \det(U_i(\mu) - U_i^2(\mu)) = \prod_{i=1}^{k} \det(U_i(\tilde\mu) - U_i^2(\tilde\mu)),$

where the first identity is again (2.18). Now denote by $\mu_C$ the symmetric matrix measure on $\mathbb{T}$ associated with $\tilde\mu$, that is

(5.2) $\int_{-1}^{1} f(x)\, d\tilde\mu(x) = \int_0^{2\pi} f(\cos\theta)\, d\mu_C(\theta).$

The canonical moments $U_i(\tilde\mu)$ are related to the canonical moments $A_i(\mu_C)$ by the relation (see Dette and Wagener (2010))

$U_i(\tilde\mu) = \tfrac{1}{2}\left(A_i(\mu_C) + I_p\right).$

This gives for the range

$\det(S_{k+1}^+(\mu) - S_{k+1}^-(\mu)) = \prod_{i=1}^{k} 4^{-p} \det(I_p - A_i(\mu_C)^2).$

Since $0 < \det(I_p - A_i(\mu_C)^2) \leq 1$, the sequence $\mathcal{I}_k(S_k)$ is increasing in $k$, which yields

$\sup_{k \in \mathbb{N}}\ -p \log\left(4^{pk} \det(S_{k+1}^+ - S_{k+1}^-)\right) = \lim_{k \to \infty} -p \log\left(\prod_{i=1}^{k} \det(I_p - A_i(\mu_C)^2)\right).$

Then Szegő's Theorem for matrix-valued measures (Theorem 2.13.5 in Simon (2005)) yields

$\mathcal{I}(S(\mu)) = \lim_{k \to \infty} -p \log\left(\prod_{i=1}^{k} \det(I_p - A_i(\mu_C)^2)\right) = -\frac{p}{2\pi} \int_0^{2\pi} \log\det W(\theta)\, d\theta,$

where $d\mu_C(\theta) = W(\theta)\frac{d\theta}{2\pi} + d\mu_s(\theta)$ is the Lebesgue decomposition of $\mu_C$. Since $\mu_C$ is symmetric, $W$ is an even function and

$\mathcal{I}(S(\mu)) = -\frac{p}{\pi} \int_0^{\pi} \log\det W(\theta)\, d\theta,$

which, after projection on $[0,1]$, yields

$\mathcal{I}(S(\mu)) = -p \int_0^1 \log\det V(x)\, \frac{dx}{\pi\sqrt{x(1-x)}},$

where $V(x) = W(\arccos(2x - 1))$ is the Radon-Nikodym derivative of $\mu$ with respect to the arcsine matricial measure. The result follows from the contraction principle and the continuity of the mapping $S \mapsto \mu_S$. □



5.4 Proof of Lemma 3.2

First we recall the notion of Fréchet differentiability (see for example Cartan (1967)).
Let $\mathcal{U}$ be an open subset of a complex Banach space $X$ and $\Phi$ a continuous map from $\mathcal{U}$ to a complex Banach space $Y$. The map $\Phi$ is called differentiable at $U \in \mathcal{U}$ if there exists a bounded linear operator $L$ from $X$ to $Y$ such that

$\lim_{V \to 0} \frac{|\Phi(U + V) - \Phi(U) - LV|}{|V|} = 0.$

We denote $L$ by $D\Phi(U)$ and call it the differential of $\Phi$ at $U$.
For this notion of differentiability we have the following rules:

• (chain rule) Let $Z$ be a Banach space, $\mathcal{V}$ be an open subset of $Y$ and $\Psi : \mathcal{V} \to Z$ be a continuous mapping from $\mathcal{V}$ to $Z$. If $\Phi(U) \in \mathcal{V}$, if $\Phi$ is differentiable at $U$ and if $\Psi$ is differentiable at $\Phi(U)$, then $\Psi \circ \Phi$ is differentiable at $U$ and

(5.3) $D(\Psi \circ \Phi)(U) = D\Psi(\Phi(U)) \circ D\Phi(U).$

• (product rule) If we have a multiplicative structure on $Y$ and if $\Phi$ and $\Psi$ are continuous maps from $\mathcal{U}$ to $Y$, both differentiable at $U_0$, then the map $\Phi\Psi : U \mapsto \Phi(U) \cdot \Psi(U)$ is differentiable at $U_0$ and for every $V$

(5.4) $D(\Phi\Psi)(U_0)V = [D\Phi(U_0)V] \cdot \Psi(U_0) + \Phi(U_0) \cdot [D\Psi(U_0)V].$

We note that the mapping $M \mapsto M^{1/2}$ is differentiable at $I_p$. Further, the action of the differential at that point is the multiplication by $\frac{1}{2}$. Lemma 3.2 now follows using the above mentioned rules and the following lemma.
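The claim that the differential of $M \mapsto M^{1/2}$ at $I_p$ is multiplication by $\frac{1}{2}$ can be illustrated by a finite-difference check for $p = 2$. The sketch below is an assumption-laden illustration: it only tests Hermitian directions $V$, uses a closed-form $2 \times 2$ square root based on the Cayley-Hamilton theorem, and all helper names are hypothetical.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def sqrt_2x2(m):
    # Square root of a 2 x 2 positive definite Hermitian matrix:
    # sqrt(M) = (M + sqrt(det M) I) / sqrt(tr M + 2 sqrt(det M)).
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    trace = (m[0][0] + m[1][1]).real
    delta = det ** 0.5
    tau = (trace + 2 * delta) ** 0.5
    return mat_scale(1 / tau, mat_add(m, mat_scale(delta, IDENTITY)))

def difference_quotient(v, t):
    # (sqrt(I + t V) - I) / t, which should approach V / 2 as t -> 0.
    root = sqrt_2x2(mat_add(IDENTITY, mat_scale(t, v)))
    return mat_scale(1 / t, mat_add(root, mat_scale(-1.0, IDENTITY)))
```

For small `t` the entries of `difference_quotient(v, t)` should be close to those of `v` divided by 2, with an error of order `t`.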



Lemma 5.1 Let $(\Gamma_1, \dots, \Gamma_n) \in \operatorname{Int} \mathcal{M}_n^{\mathbb{T}}$. For the matrices $L_n$ and $R_n$ defined in (3.6) and (3.7), respectively, the following recursions hold

(5.5) $L_n = L_{n-1}^{1/2} (I_p - A_n A_n^*) L_{n-1}^{1/2}$ and $R_n = R_{n-1}^{1/2} (I_p - A_n^* A_n) R_{n-1}^{1/2}.$



Proof: We only show the result for $L_n$. For $R_n$ the proof is left to the reader.



Here we use the notation of 



Dette and Wagenerl (12010 ) 



polynomials. Using the Szego recursion (compare e.g. 
that Ln. 1 ^ 2 is Hermitian we obtain 



jet 6% and 6?' be the orthonormal matrix 



Simon! (120051 ) section 2.13) and the fact 



= (L-^Ll/*^ + A n+1 ^, L^ 2 LH^ +1 + A n+1 ^) } 

_ r-l/2r r-1/2 , A A * 

Indeed the definition of the inner products directly yields 
The assertion of the Lemma follows. 



□

In the following we will differentiate mappings from $\mathbb{C}^{np \times p}$ to $\mathbb{C}^{p \times p}$. We have from the definition of canonical moments

(5.6) $\Gamma_k = L_{k-1}^{1/2} A_k R_{k-1}^{1/2} + M_{k-1} \quad (1 \leq k \leq n),$

where the matrices $L_{k-1}$, $R_{k-1}$ and $M_{k-1}$ are defined in (3.6) to (3.8). The differentiability of $\mathbf{A}_n \mapsto L_{k-1}^{1/2} A_k R_{k-1}^{1/2}$ at $0_p^{(n)} = (0_p, \dots, 0_p) \in \mathbb{C}^{np \times p}$ follows using the product rule. Indeed, first the linear map $\mathbf{A}_n \mapsto A_k$ is obviously differentiable at $0_p^{(n)}$. The action of the differential is the multiplication by the map itself. The differentiability of $\mathbf{A}_n \mapsto L_k$ and $\mathbf{A}_n \mapsto R_k$ can be established using induction on $k$ and Lemma 5.1 together with the chain and product rules. Again by induction one obtains $L_k(0_p^{(n)}) = R_k(0_p^{(n)}) = I_p$. Now the product rule yields, for every $V \in \mathbb{C}^{np \times p}$,

$D(L_{k-1}^{1/2} A_k R_{k-1}^{1/2})(0_p^{(n)})V = [DL_{k-1}^{1/2}(0_p^{(n)})V] \cdot A_k(0_p^{(n)}) \cdot R_{k-1}^{1/2}(0_p^{(n)}) + L_{k-1}^{1/2}(0_p^{(n)}) \cdot A_k V \cdot R_{k-1}^{1/2}(0_p^{(n)})$
$\qquad + L_{k-1}^{1/2}(0_p^{(n)}) \cdot A_k(0_p^{(n)}) \cdot [DR_{k-1}^{1/2}(0_p^{(n)})V]$
$\qquad = A_k V.$

It remains to show that $M_{k-1} = o(\|\mathbf{A}_n\|)$ for $k = 1, \dots, n$. It is done by induction with respect to $k$ together with an appeal to the continuity of the inversion at $I_{(k-1)p}$. This yields the conclusion of Lemma 3.2. □



5.5 Proof of Lemma 3.3

By definition of the canonical moments, $A_k$ depends only on $\Gamma_1, \dots, \Gamma_k$, so that the Jacobian of the mapping $(\Gamma_1, \dots, \Gamma_n) \mapsto (A_1, \dots, A_n)$ is the product of the Jacobians of $\Gamma_k \mapsto A_k$ ($k = 1, \dots, n$). As

$A_k = L_{k-1}^{-1/2} (\Gamma_k - M_{k-1}) R_{k-1}^{-1/2}$

and because $L_{k-1}$, $R_{k-1}$ and $M_{k-1}$ are independent of $\Gamma_k$, Theorem 3.2 from Mathai (1997) gives the following Jacobian $J_k$ for the mapping $\Gamma_k \mapsto A_k$:

$J_k = \det\left(L_{k-1}^{-1/2} (L_{k-1}^{-1/2})^*\right)^p \det\left(R_{k-1}^{-1/2} (R_{k-1}^{-1/2})^*\right)^p$
$\quad\ = \det(L_{k-1})^{-p} \det(R_{k-1})^{-p},$

where the last equality follows because $L_{k-1}$ and $R_{k-1}$ are Hermitian. From Lemma 5.1 we obtain

$\det(L_{k-1})^{-p} \det(R_{k-1})^{-p} = \prod_{j=1}^{k-1} \det(I_p - A_j A_j^*)^{-p} \det(I_p - A_j^* A_j)^{-p} = \prod_{j=1}^{k-1} \det(I_p - A_j A_j^*)^{-2p}.$

Consequently, the Jacobian of the inverse mapping $(A_1, \dots, A_n) \mapsto (\Gamma_1, \dots, \Gamma_n)$ is the product

$\prod_{k=1}^{n} \prod_{j=1}^{k-1} \det(I_p - A_j^* A_j)^{2p} = \prod_{k=1}^{n-1} \det(I_p - A_k^* A_k)^{2p(n-k)}.$

This yields exactly the assertion of the lemma.

□
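The exponent $2p(n-k)$ comes from counting how often each factor appears in the double product over $k = 1, \dots, n$ and $j = 1, \dots, k-1$. A small sketch of this bookkeeping (the function name is hypothetical and the count is in units of $2p$):

```python
from collections import Counter

def jacobian_exponents(n):
    # Number of times the factor indexed by j occurs in the double product
    # over k = 1..n and j = 1..k-1; the text predicts n - j occurrences.
    counts = Counter()
    for k in range(1, n + 1):
        for j in range(1, k):
            counts[j] += 1
    return dict(counts)
```

For instance, with `n = 5` the factor with index 1 occurs four times and the factor with index 4 occurs once, while index 5 does not occur at all.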



5.6 Proof of Proposition 3.5

The proof of this proposition uses the following lemma.

Lemma 5.2 Let $A$ be a $p \times p$ matrix of full rank and $A = UH^{1/2}$ its polar decomposition with $H = A^*A \in \mathcal{S}_p(\mathbb{C})$ and $U = A(A^*A)^{-1/2} \in \mathbb{U}(p)$. If $A$ is random and if

(5.7) $\forall V \in \mathbb{U}(p), \quad A \overset{(d)}{=} VA,$

then $U$ and $H$ are independent, and $U$ is Haar distributed.



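Proof of Lemma 5.2

We have for all bounded measurable functions $f_1, f_2$

$\mathbb{E}\, f_1(U) f_2(H) = \mathbb{E}\, f_1\left(A(A^*A)^{-1/2}\right) f_2(A^*A)$
(5.8) $\qquad = \mathbb{E}\, f_1\left(VA(A^*A)^{-1/2}\right) f_2(A^*A)$
(5.9) $\qquad = \int_{\mathbb{U}(p)} \left[\mathbb{E}\, f_1\left(VA(A^*A)^{-1/2}\right) f_2(A^*A)\right] d\,\mathrm{Haar}(V)$
(5.10) $\qquad = \mathbb{E}\left[\left(\int_{\mathbb{U}(p)} f_1\left(VA(A^*A)^{-1/2}\right) d\,\mathrm{Haar}(V)\right) f_2(A^*A)\right]$
(5.11) $\qquad = \mathbb{E}\left[\left(\int_{\mathbb{U}(p)} f_1(V)\, d\,\mathrm{Haar}(V)\right) f_2(A^*A)\right]$
(5.12) $\qquad = \left(\int_{\mathbb{U}(p)} f_1(V)\, d\,\mathrm{Haar}(V)\right) \mathbb{E}\left(f_2(A^*A)\right),$

where in (5.8) we take into account the invariance by left multiplication, in (5.9) the fact that $V$ is arbitrary in $\mathbb{U}(p)$, in (5.10) Fubini's theorem, and in (5.11) the invariance of Haar by right multiplication. □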
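The algebraic fact behind step (5.8) is that replacing $A$ by $VA$ leaves $H = A^*A$ unchanged and multiplies the unitary factor $U$ by $V$ on the left. The following sketch checks this for one $2 \times 2$ example; the matrices and helper names are hypothetical, and the square root uses a closed form that is valid only for $2 \times 2$ positive definite matrices.

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    # Conjugate transpose.
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def sqrt_pd(h):
    # Cayley-Hamilton square root of a 2 x 2 positive definite Hermitian matrix.
    delta = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real ** 0.5
    tau = ((h[0][0] + h[1][1]).real + 2 * delta) ** 0.5
    return [[(h[i][j] + (delta if i == j else 0)) / tau for j in range(2)] for i in range(2)]

def polar(a):
    # Returns (U, H) with H = A^* A and U = A (A^* A)^{-1/2}.
    h = mul(dagger(a), a)
    u = mul(a, inverse(sqrt_pd(h)))
    return u, h

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

With `u, h = polar(a)` and `u2, h2 = polar(mul(v, a))` for a unitary `v`, one expects `h2` to agree with `h` and `u2` to agree with `mul(v, u)` up to rounding.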

Proof of Proposition 3.5

The assumption (5.7) is trivially verified since $VA$ and $A$ have the same singular values. It remains to determine the distribution of $H = M^*M$. By a simple application of Proposition 4.1.3 of Anderson et al. (2010), we see that the singular values of $M$ have on $(0, \infty)^p$ a joint density proportional to

$|\Delta(x_1^2, \dots, x_p^2)|^2\, f(x_1^2, \dots, x_p^2)\, (x_1 \cdots x_p),$

where $\Delta$ is the Vandermonde function. This implies directly that the eigenvalues of $H$ have on $(0, \infty)^p$ a joint density proportional to

$|\Delta(\lambda_1, \dots, \lambda_p)|^2\, f(\lambda_1, \dots, \lambda_p).$

Now it is easy to lift to the matrix $H$ by Proposition 4.1.1 of Anderson et al. (2010). □



Proof of Theorem 13.41 If Af. has density f(Ah) it fulfills the assumptions of Proposition 13.51 
with 

/(Ai,---,A p ) = ^ y n(l-A J ) 2 ^- fc ) 
and the density of is proportional to 

det(/ p - B k ) 2p{n - k) . 

This expression fits with ( 12 . 1 j) with a = p and b = 2p(n — k) + p. □ 



5.7 Proof of Theorem 3.6

One proof of Theorem 3.6 directly follows from two applications of Theorem 3.4 together with Lemma 3.3, Theorem 2.2 and the continuous mapping theorem. We give a second proof here.

5.7.1 Alternative proof: Gaussian approximation

We use two clever results. The first one will give a representation of the law of $A_k$.

Theorem 5.3 (Collins (2005), Theorem 5.1, or Forrester and Krishnapur (2009)) The top $p \times p$ sub-block of a Haar distributed matrix from $\mathbb{U}(p + q)$, where $q \geq p$, has a density in $\mathbb{D}_p$ proportional to

$A \mapsto \det(I_p - AA^*)^{q-p}.$

The second one is the following "Borel theorem".

Theorem 5.4 (Jiang (2005), Corollary 1) There exist two $N \times N$ random matrices $\Pi_N = (\pi_{i,j})_{1 \leq i,j \leq N}$ and $Y_N = (y_{i,j})_{1 \leq i,j \leq N}$ defined on the same probability space such that

i) $\Pi_N$ is Haar distributed in $\mathbb{U}(N)$,

ii) all the $y_{i,j}$, $1 \leq i, j \leq N$, are independent and standard complex gaussian distributed,

iii) for $m_N = [N/(\log N)^2]$,

$\max_{i \leq N,\ j \leq m_N} |\sqrt{N}\, \pi_{i,j} - y_{i,j}| \to 0$

in probability as $N \to \infty$.

From the above notation and Lemma 3.3, $A_k$ is distributed as the top $p \times p$ sub-block of $\Pi_N$ with $N = 2p(n - k + 1)$. Up to a change of probability space we have then for $i, j \leq p$

$\sqrt{2p(n - k + 1)}\, (A_k)_{ij} - y_{ij} \to 0$

in probability as $n \to \infty$, which leads easily to the conclusion since $k$ is fixed.

□



5.8 Proof of Corollary 3.9

By the contraction principle and Corollary I3.8j (X^) n satisfies a LDP with good rate function 

>ELi logdet(J p - A*Ai), if {T x , . . . , T k ) E Int M T k , 



2r(ri, • • • , Ta 



otherwise, 



where $(A_1, \dots, A_k)$ are the canonical moments corresponding to $(\Gamma_1, \dots, \Gamma_k)$. An application of the formula for determinants of block matrices (see for example Horn and Johnson (1985)) yields

$\det(T_k) = \det(T_{k-1}) \det(R_k) = \det(T_{k-1}) \det(L_k),$

because $L_k$ and $R_k$ are Schur complements in $T_k$. From Lemma 5.1 we obtain

$\det(R_k) = \prod_{i=1}^{k} \det(I_p - A_i^* A_i)$

and so

$\sum_{i=1}^{k} \log\det(I_p - A_i^* A_i) = \log \frac{\det(T_k)}{\det(T_{k-1})},$

which is the assertion of Corollary 3.9.



□

5.9 Proof of Theorem 4.1

For $(P_n^{C})$ (Caratheodory problem), the assertion is a consequence of Theorem 3.10, the contraction principle and (4.2). Recall the main point: under $\mathcal{U}(\mathcal{M}_n^{\mathbb{T}})$, the variables $A_1, \dots, A_n$ are independent, and $A_k$ has a density proportional to $\det(I_p - A_k^* A_k)^{2p(n-k)}$.

For $(P_n^{S})$ (Schur problem), we first remark from (4.3) that the mapping $(G_0(f), \dots, G_{n-1}(f)) \mapsto (A_1, \dots, A_n)$ is triangular, i.e. that $G_k(f)$ depends only on $A_1, \dots, A_{k+1}$. Let us give details. In the scalar case, it is 1.3.48 in Simon (2005) and we follow the same scheme, up to changes due to noncommutativity. Relation (4.4) for $k = 0$ implies

$f(z)(B_1^L)^{-1}[I_p + zA_1 f_1(z)] = (B_1^R)^{-1}[z f_1(z) + A_1^*].$

Identifying the coefficients of $z^n$ on both sides yields

$G_0(f) = (B_1^R)^{-1} A_1^* B_1^L,$

$G_n(f) = (B_1^R)^{-1} G_{n-1}(f_1) B_1^L - G_0(f)(B_1^L)^{-1} A_1 G_{n-1}(f_1) B_1^L - \sum_{j=1}^{n-1} G_j(f)(B_1^L)^{-1} A_1 G_{n-1-j}(f_1) B_1^L.$

Lemma 1.3 in Damanik et al. (2008) (see also formula (2.13.52) in Simon (2005)) says that

$A_j^* B_j^L = B_j^R A_j^*$

for every $j \geq 1$, so that $(B_1^R)^{-1} A_1^* = A_1^* (B_1^L)^{-1}$. We get $G_0(f) = A_1^*$, and the first two terms of $G_n(f)$ combine into $(B_1^R)^{-1}(I_p - A_1^* A_1) G_{n-1}(f_1) B_1^L = B_1^R G_{n-1}(f_1) B_1^L$, which yields:

(5.13) $G_n(f) = B_1^R G_{n-1}(f_1) B_1^L - \sum_{j=1}^{n-1} G_j(f)(B_1^L)^{-1} A_1 G_{n-1-j}(f_1) B_1^L \quad (n \geq 1).$

Induction on $n$ leads to

(5.14) $G_n(f) = V_n A_{n+1}^* W_n + \text{polynomial in } (A_1, A_1^*, \dots, A_n, A_n^*),$

where

$V_n = B_1^R B_2^R \cdots B_n^R, \quad W_n = B_n^L B_{n-1}^L \cdots B_1^L.$
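The triangular structure can be observed numerically in the scalar case $p = 1$, where $B_1^R = B_1^L = (1 - |A_1|^2)^{1/2}$ and the recursion for the Taylor coefficients reads $G_n(f) = (1 - |A_1|^2) G_{n-1}(f_1) - \sum_{j=1}^{n-1} G_j(f) A_1 G_{n-1-j}(f_1)$. The sketch below is illustrative only: it assumes a finite list of coefficients with terminal function $f_m = 0$, works with truncated power series, and uses hypothetical function names.

```python
def series_mul(a, b, n):
    # Product of two power series, truncated to n coefficients.
    out = [0j] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out

def series_inv(a, n):
    # Reciprocal of a power series with nonzero constant term, truncated to n coefficients.
    inv = [0j] * n
    inv[0] = 1 / a[0]
    for m in range(1, n):
        inv[m] = -sum(a[i] * inv[m - i] for i in range(1, m + 1)) / a[0]
    return inv

def schur_taylor(coeffs, n):
    # Taylor coefficients of f_0, ..., f_m for coeffs = [A_1, ..., A_m], assuming f_m = 0.
    # Each step applies f_k = (z f_{k+1} + conj(A_{k+1})) / (1 + z A_{k+1} f_{k+1}).
    f = [0j] * n
    out = [f]
    for a in reversed(coeffs):
        zf = [0j] + f[:n - 1]                      # coefficients of z f_{k+1}(z)
        num = [zf[0] + a.conjugate()] + zf[1:]
        den = [1 + a * zf[0]] + [a * c for c in zf[1:]]
        f = series_mul(num, series_inv(den, n), n)
        out.append(f)
    out.reverse()                                  # out[k] = coefficients of f_k
    return out
```

In this scalar setting the coefficient lists of $f = f_0$ and $f_1$ returned by `schur_taylor` should satisfy the recursion above, with $G_0(f) = \overline{A_1}$ and $G_1(f) = (1 - |A_1|^2)\overline{A_2}$.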

From this relation we see that, if we freeze $A_1, \dots, A_n$, the Jacobian of the mapping $A_{n+1} \mapsto G_n(f)$ is (Theorem 3.2 of Mathai (1997))

$\det(V_n V_n^*)^p \det(W_n W_n^*)^p = \prod_{k=1}^{n} [\det(I_p - A_k^* A_k)]^{2p}.$

Like in the proof of Lemma 3.3, it turns out that the Jacobian of the mapping

$(A_1, \dots, A_n) \mapsto (G_0(f), \dots, G_{n-1}(f))$

is then

$\prod_{k=1}^{n-1} \det(I_p - A_k^* A_k)^{2p(n-k)}.$

We conclude that the distribution of $(A_1, \dots, A_n)$ under $P_n^{S}$ is the same as the distribution of $(A_1, \dots, A_n)$ under $P_n^{C}$. Applying again the contraction principle, we see that $(P_n^{S})$ satisfies a LDP with good rate function

W) = "- / logdetW(0)d0 

where $W$ is related to $\mu$, the underlying matrix measure. To have a rate function depending explicitly on $f$, we go back to the correspondence (4.2) between $W$ and $f$, so that

$\log\det W(\theta) = \log\det(I_p - f(e^{i\theta})^* f(e^{i\theta})) - 2 \log|\det(I_p - e^{i\theta} f(e^{i\theta}))|$

and apply Jensen's formula to the function $\det(I_p - z f(z))$. This yields (4.8). □

6 Appendix: some properties of the Wishart distribution 

For $a > 0$, the Laplace transform of the complex Wishart distribution $W_p(a)$ is given for $K \in \mathcal{S}_p(\mathbb{C})$ by

(6.1) $\Lambda(K) = \log \mathbb{E}\left[e^{\operatorname{tr}(KW)}\right] = -a \log\det(I_p - K)$

if $K < I_p$ and infinite otherwise. From the divisibility of the family of Wishart distributions (indexed by $a$), we deduce the following easy results (law of large numbers and CLT).

Proposition 6.1 As $a_n \to \infty$ we have for $W_n \sim W_p(a_n)$

(i) $\lim_{n \to \infty} \frac{1}{a_n} W_n = I_p$ (in probability),

(ii) $(a_n)^{-1/2}(W_n - a_n I_p) \xrightarrow{(d)} \mathrm{GUE}_p.$

Since the following large deviations result is not so obvious, we give a proof.

Proposition 6.2 For fixed $p$ and $a > 0$, if the variables $X_k$, $k \geq 1$, are independent and $W_p(a)$ distributed, then $\frac{1}{n}(X_1 + \cdots + X_n)$ satisfies a LDP in $\mathcal{S}_p(\mathbb{C})$ with good rate function

(6.2) $\Lambda^*(X) = \begin{cases} \operatorname{tr} X - a \log\det X - ap(1 - \log a) & \text{if } \det X > 0, \\ \infty & \text{otherwise.} \end{cases}$

Proof: The multidimensional Cramér theorem gives a LDP with good rate function

(6.3) $\Lambda^*(X) = \sup_{K \in \mathcal{S}_p(\mathbb{C})} \operatorname{tr}(KX) - \Lambda(K).$

We first give a non variational expression of $\Lambda^*(X)$.

If $\det X = 0$, for every $n$ we choose $K_n \in \mathcal{S}_p(\mathbb{C})$ such that $K_n x = 0$ for $x$ in the range of $X$ and such that the restriction of $K_n$ to the kernel of $X$ is $-n I_d$, where $d \geq 1$ is the dimension of this kernel. We have $\operatorname{tr}(K_n X) - \Lambda(K_n) = ad \log(n + 1)$ and the supremum in (6.3) is infinite.

If $\det X \neq 0$, make the variable change $K = I_p - aX^{-1}L$ and observe that

(6.4) $\log\det L \leq \operatorname{tr}(L - I_p)$

with equality only at $L = I_p$. □
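In the scalar case $p = 1$ the Legendre transform in (6.3) can be compared with the closed form (6.2) by brute force. This is a rough numerical sketch under stated assumptions: the grid bounds and the function names are hypothetical choices made for the example.

```python
import math

def cumulant(k, a):
    # Scalar form of (6.1): Lambda(k) = -a log(1 - k) for k < 1.
    return -a * math.log(1 - k)

def legendre_by_grid(x, a, steps=100000, k_min=-50.0):
    # Brute-force supremum of k x - Lambda(k) over a grid of k in (k_min, 1).
    best = -math.inf
    for i in range(1, steps):
        k = k_min + (1 - k_min) * i / steps
        best = max(best, k * x - cumulant(k, a))
    return best

def wishart_rate_closed_form(x, a):
    # Scalar form of (6.2): x - a log x - a (1 - log a) for x > 0.
    return x - a * math.log(x) - a * (1 - math.log(a))
```

The supremum is attained at $k = 1 - a/x$, which corresponds to $L = I_p$ in the change of variable $K = I_p - aX^{-1}L$, and the rate function vanishes at $x = a$.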

At last, we have another LDP for rescaled Wishart distributions. Its proof is left to the reader and uses directly the density (2.13).

Proposition 6.3 Let $p$ and $a$ be fixed. If $X$ is $W_p(a)$ distributed then $X/n$ satisfies a LDP in $\mathcal{S}_p(\mathbb{C})$ with good rate function

(6.5) $\mathcal{I}_s(X) = \operatorname{tr} X.$